ABSTRACT 

We examine Cosmic Microwave Background (CMB) temperature power spectra from the 
BOOMERANG, MAXIMA, and DASI experiments. We non-parametrically estimate the true power 
spectrum with no model assumptions. This is a significant departure from previous research which used 
either cosmological models or some other parameterized form (e.g. parabolic fits). Our non-parametric 
estimate is practically indistinguishable from the best fit cosmological model, thus lending independent 
support to the underlying physics that governs these models. We also generate a confidence set for the 
non-parametric fit and extract confidence intervals for the numbers, locations, and heights of peaks and 
the successive peak-to-peak height ratios. At the 95%, 68%, and 40% confidence levels, we find functions 
that fit the data with one, two, and three peaks respectively ($0 < \ell < 1100$). Therefore, the current 
data prefer two peaks at the $1\sigma$ level. However, we also rule out a constant temperature function at the 
$> 8\sigma$ level. If we assume that there are three peaks in the data, we find their locations to be within $\ell_1$ 
= (118,300), $\ell_2$ = (377,650), and $\ell_3$ = (597,900). We find the ratio of the first peak-height to the second 
to be (1.06,4.27) and that of the second to the third to be (0.41,2.5). All measurements are for 95% 
confidence. If the standard errors on the temperature measurements were reduced to a third of what 
they are currently, as we expect to be achieved by the MAP and Planck CMB experiments, we could 
eliminate two-peak models at the 95% confidence limit. The non-parametric methodology discussed in 
this paper has many astrophysical applications. 

1. INTRODUCTION 

There has been growing evidence for the existence of peaks and valleys in the temperature
power spectrum of the CMB. From a theoretical standpoint, such features are a direct result
of the physics in the primordial photon-electron plasma, predicted by gravitational
instability models of structure formation (Peebles & Yu 1970; Hu and Sugiyama 1995). These
features are important for constraining the cosmology of our Universe. For instance, in many
models, the ratio of the height of the first peak to the second peak is dependent on the
spectral tilt, $n_s$, and the baryon fraction, $\Omega_{\rm baryons}/\Omega_{\rm matter}$.
The ratio of the third peak to the second peak is dependent on $\Omega_{\rm matter} h^2$
and $n_s$ (see Hu et al. 2001 for further discussion).

Most often in the literature, the CMB power spectra are fit to a suite of cosmological
models (Tegmark et al. 1999, 2000; Jaffe et al. 2001). These physical models are
well-motivated and sophisticated, but they contain many free parameters (e.g., eleven in the
work of Wang, Tegmark, and Zaldarriaga 2001; hereafter WTZ), some of which are unknown
(ionization depth, contribution from gravity waves) or degenerate (e.g., see Efstathiou
2001). Typically, some sort of likelihood analysis is performed to determine which
cosmological model best fits the data.

There is, however, another approach: place constraints on the features of the power spectrum
and use these features to determine the cosmological parameters. The assumptions here are
that the peaks and valleys are best described by the broad range of cosmological models (as
in Hu et al. 2001) or by parabolas or some other chosen function (as in Knox & Page 2000 and
de Bernardis et al. 2001). A potential problem in all of these approaches is that it is
difficult to get valid statistical confidence intervals (see Section 2). There is also the
concern that the fitted features may be artifacts of the multitude of assumptions.

In this paper, we take what may be considered a more conservative approach: we make no
assumptions whatsoever about the true underlying function. Our new statistical technique is
non-parametric and allows for valid confidence intervals to be measured for peak
characteristics. One theme of our work is that confidence intervals for any quantity of
interest can be extracted from a confidence set for the unknown spectrum. These techniques
for fitting and inference are applicable to a wide variety of astrophysical data-analysis
problems.

2. OVERVIEW OF NONPARAMETRIC ANALYSIS 

In general, non-parametric statistical methods estimate functions without imposing a
finite-dimensional parametric form. The resulting estimates are obtained by carefully
smoothing the data to balance bias and variance. See Hastie and Tibshirani (1990) for
details and examples.

The CMB data, after suitable preprocessing, take the form $(X_1, Y_1), \ldots, (X_n, Y_n)$,
where $X_i$ is the multipole moment (usually denoted $\ell$) ordered according to increasing
$X$, and $Y_i$ is the estimated power spectrum at $X_i$ (usually denoted $C_\ell$ with some
constants). Let $f(X_i)$ be the true power spectrum at $X_i$. Then,

$Y_i = f(X_i) + \epsilon_i$ (1)

where $\epsilon_i$ is the error in estimating $f(X_i)$. We require that the $\epsilon_i$ are
uncorrelated, zero-mean Gaussians. Therefore, we take the uncorrelated statistical errors as
given by the experiments. There are additional, correlated noise terms in all of the
experiments due to calibration, beam width and pointing uncertainties, which were not used
in our analysis. The magnitude of these correlated errors is typically 5% - 20%, although
the BOOMERANG effective beam uncertainties can be higher for $\ell > 600$. The method we use
in this work can be generalized to include correlated errors (Genovese et al., in
preparation). We assume that $f^2$ is integrable; otherwise, we make no further assumptions.
Our non-parametric technique yields a vector $\hat{f}$ that estimates the vector
$f = (f(X_1), \ldots, f(X_n))$ of the true spectrum's values at the $X_i$'s (see Figure 1
top).

After we perform the fit, we need to quantify the uncertainty to make any inferences. We
begin by constructing a set of vectors $C_n$ from the data that traps the true power
spectrum, $f$, with probability $1 - \alpha$, where $1 - \alpha$ is a pre-specified
confidence level. With $C_n$ we can derive a confidence interval for any quantity of
interest. Consider, for example, the number of peaks. For any vector $f \in C_n$, define
$P(f)$ to be the number of peaks in $f$. Then, because $C_n$ contains the true spectrum with
confidence $1 - \alpha$, the range on the measured peaks is
$(\min_{f \in C_n} P(f), \max_{f \in C_n} P(f))$.
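
The min/max construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a peak is taken to be a rise followed by a fall, the confidence set is stood in for by a short hypothetical list of candidate vectors, and the numbers are made up rather than CMB data. The helper names `count_peaks` and `peak_count_interval` are ours.

```python
def count_peaks(values):
    """Count peaks, taking a peak to be a rise followed later by a fall."""
    peaks = 0
    rising = False
    for prev, curr in zip(values, values[1:]):
        if curr > prev:
            rising = True
        elif curr < prev and rising:
            peaks += 1
            rising = False
    return peaks


def peak_count_interval(candidates):
    """Range of P(f) over a finite collection standing in for C_n."""
    counts = [count_peaks(f) for f in candidates]
    return min(counts), max(counts)


# Hypothetical members of a confidence set (made-up numbers).
candidates = [
    [1, 3, 5, 4, 2, 1, 1],  # one peak
    [1, 4, 2, 3, 5, 2, 1],  # two peaks
    [1, 4, 2, 5, 3, 4, 2],  # three peaks
]
print(peak_count_interval(candidates))  # (1, 3)
```

In practice the confidence set is a continuum, so the minimum and maximum have to come from a search over it rather than from a fixed list.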

We can find confidence intervals for the locations and heights of peaks in a similar way.
There are important advantages for these confidence measurements over "standard" $\chi^2$ or
even Bayesian techniques, which we discuss further in the next section.

3. METHODOLOGY 

Refer to the model in equation (1). Let the functions $\phi_1, \phi_2, \ldots$ be an
orthonormal basis over the range of the observed $X_i$'s. The choice of basis is somewhat
important. For instance, if we were fitting to a galaxy spectrum, with highly peaked
emission lines on top of a smooth, broad continuum, a wavelet basis would allow for the
simultaneous fitting of broad and narrow features. On the other hand, a cosine basis would
require large amplitude high frequency terms to match the emission lines. This might cause
the continuum fit to be wiggly. In our work, we use a discrete cosine basis,
$\phi_1(x) = 1$, $\phi_2(x) = \sqrt{2}\cos(\pi x)$, $\phi_3(x) = \sqrt{2}\cos(2\pi x)$,
$\ldots$, since it has well determined properties for confidence limits. For simplicity in
the derivation, we assume that the variance (second moment about the mean) of each
$\epsilon_i$ is the same value $\sigma^2$. This is not necessary in practice, nor was it
assumed in the full analysis.
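
As a quick numerical check of the orthonormality of this cosine basis, the following Python sketch approximates the inner products of the first few basis functions. It assumes the multipole range has been rescaled to the unit interval [0, 1], which the text does not state explicitly; the function names are illustrative only.

```python
import math


def cosine_basis(j, x):
    """phi_1(x) = 1 and phi_j(x) = sqrt(2) cos((j - 1) pi x) for j >= 2."""
    if j == 1:
        return 1.0
    return math.sqrt(2.0) * math.cos((j - 1) * math.pi * x)


def inner_product(j, k, n=2000):
    """Midpoint-rule approximation of the integral of phi_j * phi_k on [0, 1]."""
    xs = [(i + 0.5) / n for i in range(n)]
    return sum(cosine_basis(j, x) * cosine_basis(k, x) for x in xs) / n


# The resulting matrix should be close to the 3 x 3 identity.
for j in range(1, 4):
    print([round(inner_product(j, k), 6) for k in range(1, 4)])
```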

Any square integrable function $f$ can be expanded as
$f(x) = \sum_{j=1}^{\infty} \beta_j \phi_j(x)$. Estimating $f$ amounts to estimating the
$\beta_j$'s. If we have chosen a good basis for representing $f$, the higher-order terms in
this series will tend to decay rapidly. Hence, we can approximate the infinite sum with a
finite sum $f(x) \approx \sum_{j=1}^{N} \beta_j \phi_j(x)$. So, to estimate the underlying
function, we need to find the $\beta_j$'s.

Let $Z_j = N^{-1/2} \sum_{i=1}^{n} Y_i \phi_j(X_i)$, for $j = 1, \ldots, N$. Based on the
theory in Beran (2000), we take $N = n$. This choice ensures that the estimate of $f$ is
optimal and that the resulting confidence intervals remain valid. It can be shown that each
statistic $Z_j$ has approximately a Gaussian distribution with mean
$\theta_j = \sqrt{n}\beta_j$ and variance $\sigma^2$. This re-parameterization means that we
need to estimate $\theta = (\theta_1, \ldots, \theta_n)$. Given an estimate $\hat{\theta}$
of $\theta$, we can estimate $f(X_i)$ via
$\hat{f}(X_i) = (1/\sqrt{n}) \sum_{j=1}^{n} \hat{\theta}_j \phi_j(X_i)$.

$^1$ Strictly speaking, we should use a limsup, which refers to the largest difference
between the true coverage and the claimed coverage that can be obtained as the sample size
gets large.

We could use $Z_j$ as an estimate of $\theta_j$, but this yields a poor estimate of $f$
because it is too variable. A smoother estimate can be obtained by damping the higher
frequency terms in the expansion. In statistics, this is called a "shrinkage estimator". We
consider shrinkage estimators of the form
$\hat{\theta} = (\gamma_1 Z_1, \gamma_2 Z_2, \ldots, \gamma_n Z_n)$, where
$1 \ge \gamma_1 \ge \gamma_2 \ge \cdots \ge \gamma_n \ge 0$ are called the shrinkage
coefficients. The smaller $\gamma_j$, the smaller the contribution of $\phi_j$ in the
estimating expansion for $f$. With the cosine basis, for example, such shrinkage estimators
damp down the contribution from high-frequency terms.
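
The transform to cosine coefficients and the effect of shrinkage can be illustrated with a small Python sketch. It is not the analysis code: it assumes equally spaced design points rescaled to [0, 1], takes $N = n$, uses made-up data values, and picks the shrinkage coefficients by hand rather than by any optimization. All function names are hypothetical.

```python
import math


def phi(j, x):
    """Cosine basis on [0, 1]: phi_1 = 1, phi_j = sqrt(2) cos((j - 1) pi x)."""
    return 1.0 if j == 1 else math.sqrt(2.0) * math.cos((j - 1) * math.pi * x)


def coefficients(xs, ys):
    """Z_j = n^{-1/2} sum_i Y_i phi_j(X_i) for j = 1..n (taking N = n)."""
    n = len(xs)
    return [sum(y * phi(j, x) for x, y in zip(xs, ys)) / math.sqrt(n)
            for j in range(1, n + 1)]


def shrink(z, gamma):
    """theta_hat_j = gamma_j * Z_j."""
    return [g * zj for g, zj in zip(gamma, z)]


def reconstruct(xs, theta_hat):
    """f_hat(X_i) = n^{-1/2} sum_j theta_hat_j phi_j(X_i)."""
    n = len(xs)
    return [sum(t * phi(j, x) for j, t in enumerate(theta_hat, start=1)) / math.sqrt(n)
            for x in xs]


n = 8
xs = [(i + 0.5) / n for i in range(n)]          # assumed equally spaced points
ys = [3.0, 5.0, 6.0, 4.0, 2.0, 2.5, 3.0, 1.0]   # made-up data values
z = coefficients(xs, ys)

unshrunk = reconstruct(xs, z)                    # gamma_j = 1 for all j
gamma = [1.0, 0.9, 0.7, 0.4, 0.1, 0.0, 0.0, 0.0]  # hand-picked, non-increasing
smoothed = reconstruct(xs, shrink(z, gamma))
print([round(v, 3) for v in unshrunk])
print([round(v, 3) for v in smoothed])
```

With no shrinkage the reconstruction simply passes through every data point on this grid, which is the "too variable" estimate; damping the high-order coefficients gives a smoother curve.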

Every choice of $\gamma = (\gamma_1, \ldots, \gamma_n)$ gives an estimate
$\hat{\theta}_\gamma$ which then yields an estimate $\hat{f}_\gamma$ of the function $f$. We
would like to choose $\gamma$ to minimize the mean squared error,
$\mathrm{MSE}(\gamma) = E\left[\int (f(x) - \hat{f}_\gamma(x))^2 dx\right]$. Unfortunately
$\mathrm{MSE}(\gamma)$ is unknown, because it depends on the true $f$, but it can be
estimated by Stein's Unbiased Risk Estimator (Stein 1981):

$\widehat{\mathrm{MSE}}(\gamma) = \sum_{j=1}^{n} \left[ \sigma^2 \gamma_j^2 + (Z_j^2 - \sigma^2)(1 - \gamma_j)^2 \right].$ (2)


We use the Pool Adjacent Violators (PAV) algorithm (Robertson, Wright, and Dykstra 1988) to
minimize $\widehat{\mathrm{MSE}}(\gamma)$ as a function of $\gamma$ while maintaining the
ordering constraint on $\gamma$. The minimizer is denoted $\hat{\gamma}$ and the final
estimate is therefore
$\hat{\theta} = (\hat{\gamma}_1 Z_1, \hat{\gamma}_2 Z_2, \ldots, \hat{\gamma}_n Z_n)$.
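
To make the minimization concrete, here is a minimal Python sketch of an ordered fit by pooling adjacent violators. It relies on the algebraic identity that each term of the risk estimate equals $Z_j^2(\gamma_j - g_j)^2$ plus a constant, with $g_j = 1 - \sigma^2/Z_j^2$, so the problem becomes a weighted least-squares fit of a non-increasing sequence, clipped to [0, 1]. The sketch assumes a common known $\sigma^2$ and non-zero $Z_j$, and the input values are made up; it is not the implementation used for the analysis.

```python
def risk_estimate(z, gamma, sigma2):
    """sum_j [sigma^2 gamma_j^2 + (Z_j^2 - sigma^2)(1 - gamma_j)^2]."""
    return sum(sigma2 * g * g + (zj * zj - sigma2) * (1.0 - g) ** 2
               for zj, g in zip(z, gamma))


def monotone_shrinkage(z, sigma2):
    """Minimize the risk estimate subject to 1 >= gamma_1 >= ... >= gamma_n >= 0."""
    weights = [zj * zj for zj in z]                 # assumes every Z_j != 0
    targets = [1.0 - sigma2 / w for w in weights]   # unconstrained minimizers g_j
    blocks = []  # each block holds [weighted mean, total weight, length]
    for g, w in zip(targets, weights):
        blocks.append([g, w, 1])
        # Pool adjacent blocks while they violate the non-increasing order.
        while len(blocks) > 1 and blocks[-2][0] < blocks[-1][0]:
            g2, w2, c2 = blocks.pop()
            g1, w1, c1 = blocks.pop()
            blocks.append([(g1 * w1 + g2 * w2) / (w1 + w2), w1 + w2, c1 + c2])
    gamma = []
    for g, _, count in blocks:
        gamma.extend([min(1.0, max(0.0, g))] * count)  # clip to [0, 1]
    return gamma


z = [10.0, 6.0, 1.0, 3.0, 0.5]   # made-up coefficients
sigma2 = 1.0                     # assumed common variance
gamma_hat = monotone_shrinkage(z, sigma2)
print([round(g, 4) for g in gamma_hat])
print(round(risk_estimate(z, gamma_hat, sigma2), 4))
```

In this example the third and fourth coefficients are pooled to a common value because their unconstrained targets violate the ordering, and the last coefficient is clipped to zero.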

Next we construct a confidence set, $C_n$, for the vector of function values at the observed
data, $f_n = (f(X_1), f(X_2), \ldots, f(X_n))$. Throughout the paper we have said that the
confidence sets are "valid." Formally, what this means is that for any $c > 0$,

$\lim{}^{1} \sup_{\|f_n\| \le c} \left| \Pr(f_n \in C_n) - (1 - \alpha) \right| \to 0$ (3)

as $n \to \infty$, where the operation $\|a\|$ denotes $\sqrt{n^{-1} \sum_i a_i^2}$. This
means that for large $n$, the confidence set traps the values of the true function with
probability very close to $1 - \alpha$. The confidence set is an ellipse of the form


$C_n = \left\{ \theta : n^{-1} \sum_{j} (\theta_j - \hat{\theta}_j)^2 \le \widehat{\mathrm{MSE}}(\hat{\gamma}) + n^{-1/2} \tau z_\alpha \right\}$ (4)

where $z_\alpha$ is the number that has probability $\alpha$ to the right under a standard
Gaussian. For instance, if we choose 95% confidence ($\alpha = 0.05$), then
$z_\alpha = 1.645$, while for 67% confidence ($\alpha = 0.33$), the confidence "radius" is
smaller with $z_\alpha = 0.44$. Here, $\tau$ is defined similar to Beran (2000):

$\tau^2 = 4\sigma^2 \sum_j (Z_j^2 - \sigma^2)(1 - \hat{\gamma}_j)^2 + 2\sigma^4 \sum_j (2\hat{\gamma}_j - 1)^2.$ (5)
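
The quantiles quoted above can be reproduced with the Python standard library, and membership in a set of the form of equation (4) is a simple distance check. In this sketch the squared radius is passed in as a number (the caller is assumed to have computed the right-hand side of the inequality), and the vectors are made-up values.

```python
from statistics import NormalDist


def z_alpha(alpha):
    """Upper-tail standard Gaussian quantile: Pr(Z > z_alpha) = alpha."""
    return NormalDist().inv_cdf(1.0 - alpha)


def in_confidence_ball(theta, theta_hat, radius_sq):
    """Check n^{-1} sum_j (theta_j - theta_hat_j)^2 <= radius_sq."""
    n = len(theta)
    dist_sq = sum((t - th) ** 2 for t, th in zip(theta, theta_hat)) / n
    return dist_sq <= radius_sq


print(round(z_alpha(0.05), 3), round(z_alpha(0.33), 2))  # 1.645 0.44

theta_hat = [9.0, 2.0, 0.5, 0.0]          # made-up center
print(in_confidence_ball([8.5, 2.5, 0.0, 0.0], theta_hat, radius_sq=0.25))
```

A lower confidence level gives a smaller $z_\alpha$ and hence a smaller set, which is why more candidate functions are excluded at 68% than at 95%.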

We can in turn write the confidence set for $f$ as

$\mathcal{D}_n = \left\{ f : f = n^{-1/2} \sum_{j=1}^{n} \theta_j \phi_j \ \mathrm{and}\ \theta \in C_n \right\}.$ (6)

Here, the true power spectrum $f = (f(X_1), \ldots, f(X_n))$ is estimated at each of the
original data points. Prior information about $f$ of the form $f \in \mathcal{P}_n$ allows
us to replace $\mathcal{D}_n$ with $\mathcal{D}_n \cap \mathcal{P}_n$ while maintaining a
$1 - \alpha$ confidence level. In particular, we take $\mathcal{P}_n$ to be the set of
vectors $f$ corresponding to spectra with zero to three peaks for $\ell < 1100$. From the
confidence set $\mathcal{D}_n \cap \mathcal{P}_n$, we can derive confidence sets for any
interesting feature of $f$.

A key point is that the resulting intervals on any measured quantity are simultaneously
valid, meaning that all of the intervals contain the corresponding true quantity with
probability $1 - \alpha$. In contrast, deriving $1 - \alpha$ confidence intervals from a
collection of individual chi-squares does not obtain $1 - \alpha$ simultaneous coverage, but
often substantially lower coverage. For example, a common technique to determine the 95%
confidence range for a specific parameter is to "marginalize" over the other parameters (see
Tegmark et al. 1999, 2000). Such a technique will provide full coverage for that one
parameter. However, when parameter ranges are combined, the confidence is lower than 95%.
Bayesian intervals derived from a posterior distribution suffer from a similar problem in
that the long-run frequency that the interval contains the true quantity may be much less
than $1 - \alpha$.
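
A short calculation shows how quickly joint coverage can fall when separate intervals are combined. The sketch below makes the simplifying assumption that the intervals are statistically independent, which need not hold for real parameter estimates, and also reports the Bonferroni lower bound, which holds without that assumption.

```python
def joint_coverage_if_independent(level, k):
    """Probability that k independent intervals, each at `level`, all cover."""
    return level ** k


def bonferroni_lower_bound(level, k):
    """Guaranteed joint coverage of k intervals under arbitrary dependence."""
    return max(0.0, 1.0 - k * (1.0 - level))


for k in (1, 2, 5, 11):
    print(k,
          round(joint_coverage_if_independent(0.95, k), 3),
          round(bonferroni_lower_bound(0.95, k), 3))
```

For eleven parameters, each with its own 95% interval, the joint coverage under independence is about 57%, and the guaranteed lower bound is only 45%.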

4. RESULTS 

Figure 1 shows the combined data from BOOMERANG, MAXIMA, and DASI (Halverson et al. 2001;
Lee et al. 2001; Netterfield et al. 2001). The bottom panel compares our non-parametric fit
to the fit of WTZ, who use more experiments than the three examined here. Recall that our
fit requires no assumptions about the data or the underlying cosmology. The WTZ fit, on the
other hand, requires an 11 dimensional parameter space, with numerous prior assumptions
placed on those parameters (for example, the Hubble constant is constrained to $72 \pm 8$
km s$^{-1}$ Mpc$^{-1}$). The agreement between the two fits is very good, considering the
difference in methodology.

The power of this technique lies in the ability to make quantitative statements about the
true function with some specified confidence. First, we checked the 95% confidence set and
found that every function within this set has at least one peak. Specifically, we set
$\alpha = 0.05$ and determined the "radius" of this confidence ellipse (i.e., the right side
of the inequality in Eq. 4). We then searched all possible functions with zero to three
peaks over the specified range in $\ell$ to see if the condition in Eq. 4 was met. Our
definition of one peak requires the sorted data (according to increasing $\ell$) to have a
section with increasing temperature followed by a section with decreasing temperature. For
$\alpha = 0.05$, no zero-peaked functions met this condition. Our set of zero-peaked
functions includes those with constant temperature as well as those with either increasing
temperature or decreasing temperature, but not both. We rule out functional forms with zero
peaks at the 95% confidence level. We also compared specifically against the best fit
constant function for the power spectrum. We rule out a flat temperature spectrum (based on
the weighted average of the data) at $> 8\sigma$ confidence. We perform the same analysis at
the 68% confidence level (i.e., $\alpha = 0.32$ in Eq. 4). The 68% confidence set rules out
all single-peak functions. So at the one sigma level, we have found at least two peaks in
the data. Only in the 40% ($\alpha = 0.6$) confidence set can we rule out two-peak
functions. From our confidence sets, the data support two of the three peaks in the CMB
power spectrum (for $0 < \ell < 1100$).
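
For the comparison against a flat spectrum, one conventional ingredient is the inverse-variance weighted average of the data and the chi-square of the data about it. The Python sketch below shows that calculation only; it is not necessarily the computation behind the significance quoted above, and the values are made up rather than taken from the experiments.

```python
def weighted_mean(ys, sigmas):
    """Inverse-variance weighted average, the best-fit constant."""
    weights = [1.0 / s ** 2 for s in sigmas]
    return sum(w * y for w, y in zip(weights, ys)) / sum(weights)


def chi_square_constant(ys, sigmas):
    """Chi-square of the data about their weighted mean."""
    c = weighted_mean(ys, sigmas)
    return sum(((y - c) / s) ** 2 for y, s in zip(ys, sigmas))


ys = [2.0, 4.0, 6.0]       # made-up band powers
sigmas = [1.0, 1.0, 1.0]   # made-up standard errors
print(weighted_mean(ys, sigmas), chi_square_constant(ys, sigmas))  # 4.0 8.0
```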

We calculate ranges for the peak heights and locations in Table 1. Figure 2 shows the
non-parametric fit with confidence intervals for peak heights and locations at 95%
confidence. Finally, we computed confidence intervals for the ratios of successive peaks
under a three-peak model. The 95% confidence interval for the ratio of the height of the
first peak to the height of the second peak is (1.06,4.27). The 95% confidence interval for
the ratio of the height of the second peak to the height of the third peak is (0.41,2.5).
This rules out equal heights for the first two peaks at the 95% level. These results are
consistent with Hu et al. (2001), who find much stronger constraints on the height-height
ratios by fitting to cosmological models.
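
The same min/max logic gives intervals for peak-height ratios. The following Python sketch is illustrative only: the candidate vectors are hypothetical stand-ins for three-peak members of a confidence set, the numbers are made up, and a peak is again taken to be a rise followed by a fall.

```python
def peak_heights(values):
    """Heights of the peaks, where a peak is a rise followed by a fall."""
    heights, rising = [], False
    for prev, curr in zip(values, values[1:]):
        if curr > prev:
            rising = True
        elif curr < prev and rising:
            heights.append(prev)
            rising = False
    return heights


def ratio_interval(candidates, first, second):
    """Range of height[first] / height[second] over three-peak candidates."""
    ratios = []
    for f in candidates:
        h = peak_heights(f)
        if len(h) == 3:  # keep three-peak candidates only
            ratios.append(h[first] / h[second])
    return min(ratios), max(ratios)


candidates = [
    [1, 6, 2, 3, 1, 2, 1],      # peak heights 6, 3, 2
    [1, 5, 2, 4, 1, 3, 1],      # peak heights 5, 4, 3
    [1, 8, 2, 2.5, 1, 5, 1],    # peak heights 8, 2.5, 5
    [1, 5, 4, 3, 2, 1, 0],      # one peak only, ignored
]
print(ratio_interval(candidates, 0, 1))  # first-to-second ratio range
print(ratio_interval(candidates, 1, 2))  # second-to-third ratio range
```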

5. DISCUSSION AND CONCLUSIONS 

We present an application of a new and powerful non-parametric technique to CMB temperature
data. Past approaches were based on complicated cosmological models or parameterized forms.
There is superb visual agreement between the non-parametric fit and the best fitting
cosmological model. Quantitatively, we provide constraints on the peak locations, heights,
and height ratios of the power spectrum. These constraints can be used to place
corresponding limits on the cosmological parameters that they describe. For instance, Hu et
al. (2001) derive relationships which could, in principle, be used for this purpose.

At the $2\sigma$ confidence level, we find at least one peak in the current CMB power
spectra data, while at the $1\sigma$ level, we find two or more peaks. Only for a very low
confidence, 40%, can we rule out two-peak functions. Therefore, the data do not yet show the
three expected peaks for $\ell < 1100$ (in the three CMB datasets examined here). There are
two explanations for this: the model is right, but there is insufficient precision in the
current data, or the model is wrong. If in fact the errors on the current measurements are
simply too large, then these standard errors would have to be reduced to one-third of their
current values to rule out a two-peak spectrum at the 95% confidence level. This suggests a
range for the maximum required errors for future CMB experiments (via MAP and Planck) to
"discover" three peaks in the CMB spectrum.

We point out that the lack of assumptions used to arrive at our best fit is conservative. On
the other hand, results from fitting assumed cosmological models are optimistic, since those
models all have multi-peaked spectra (e.g., Hu et al. 2001). While the physical
underpinnings for cosmological models are well founded, the last 50 years (or even five)
have seen radical changes in those models which best fit the data. Therefore, a method to
describe the CMB that is "cosmology free" has scientific value. Finally, we note that the
methods described here can be applied to the many astrophysical problems that are not well
suited for standard parametric techniques.

During the refereeing process of our paper, two related papers came to our attention
(Durrer, Novosyadlyj & Apunevych 2001; Douspis & Ferreira 2001). These papers perform a
model-independent measurement of the CMB power spectrum, but they are not non-parametric
estimates of the CMB acoustic peaks, as discussed herein, since they use phenomenological
models to describe the underlying power spectrum. It is interesting to note, however, that
all three analyses find low statistical significances for the detection of the second and
third peaks. We await higher precision measurements of the CMB power spectrum to secure the
detection, location, shape and amplitude of these peaks.

Table 1

Three Peak Confidence Intervals

Peak    Location     Height
1       (118,300)    (4361,8055)
2       (377,650)    (1829,4798)
3       (597,900)    (1829,4688)

Fig. 1.— The top panel shows the raw CMB data from the BOOMERANG, MAXIMA, and DASI experiments. The middle panel is the raw 
CMB data over-plotted with the best fit using the non-parametric technique. The bottom panel shows the non-parametric fit against the 
best cosmological model fit from Wang, Tegmark, and Zaldarriaga (2001). 

Fig. 2.— The ranges on the locations and peak heights of the non-parametric three peak (and two dip) fit function (95% confidence). 
























